MEMORY SYSTEM HAVING POINT-TO-POINT BUS CONFIGURATION

RELATED APPLICATIONS 

This application claims the benefit of United States Provisional Patent Application Serial 
Number 60/273,890, filed March 6, 2001. 

BACKGROUND OF THE INVENTION 

Memory systems are often arranged in a stub architecture. In such an architecture,
memory modules are arranged in parallel as stubs along a common data bus, control/address bus, 
and clock bus. In order to increase data transmission rates in a memory system having a stub bus 
architecture, careful control over signal integrity is necessary; signal integrity in turn being 
affected by the stub load. A stub load behaves on a transmission line as a discontinuous point, 
which results in signal reflection. Signal reflection due to the stub load deteriorates signal 
integrity, thereby limiting the overall data transmission rate of the system. 

Attempts have been made to suppress the detrimental effect of a stub load by configuring
the stub bus according to a stub-series-terminated-logic (SSTL) architecture. However, this
configuration has a fundamental limit in increasing the data transmission rate because, although 
the adverse effects of the stub load are mitigated, the load is still included in the configuration. 

To overcome the limitations encountered by the stub bus architecture, a short-loop-through 
(SLT) structure has been proposed. In the SLT bus structure, system components are arranged in 
series on a signal line. In the case of a memory module, for example, the signal line extends along 
the motherboard through a module connector to a first side of the module and on to a desired 
component on the module. The signal line then passes through the module body to a second 
component on a second face of the module and returns to the motherboard through a second 
coupling on the module connector. From the first module connector, the signal line extends on the 
motherboard to a second module connector, to the second module, and so on. Therefore, in the 
SLT bus structure, there are no discontinuous points due to stub loads, such that signal integrity is 
enhanced and data transmission rate can therefore be increased. However, since two pins are
required for each signal, the resulting number of module pins is double the number required by the
stub bus structure, which increases system costs. Moreover, the loading of a signal line increases
as the number of modules increases, which limits the maximum operable data transmission rate.

To address the limitations encountered in the SLT bus structure, a point-to-point bus 
structure has been proposed. For example, United States Patent Number 5,742,840, to Hansen, et
al., proposes such a structure in FIG. 13. In the point-to-point bus structure, only a single load is
driven by a single source, and a discontinuous point, such as a stub, does not exist. In this manner,
the data transmission rate can be considerably increased. However, as data is passed from module to
module, a complicated clocking scheme is required, as each data transfer between modules may
have its own phase relationship and therefore the phase relationship of the clock signals in the read
direction and write direction may be different, depending on module position.

SUMMARY OF THE INVENTION

The present invention is directed to a clocking system and method in a point-to-point bus
structure that overcomes the limitations of the conventional approaches. In one embodiment, the
present invention ensures the same phase relationship for the write clock in the write direction for
all data transfers between modules, and similarly the same phase relationship for the read clock in
the read direction for all data transfers between modules, regardless of module location. In another
embodiment, on a given module, all transfers of data between a data buffer and a memory device
in both read and write directions are clocked by a read clock signal and a write clock signal that
have the same phase relationship and have the same propagation delay as the data bus between the
buffer and the memory device.

In one aspect, the present invention is directed to a memory module for use in a memory
system having a point-to-point bus configuration. The memory module includes a memory device
and a buffer, the buffer receiving a first write clock signal and a control signal that includes a read
or write command in a first direction of transmission, the buffer receiving a first read clock signal
in a second direction of transmission, the buffer being coupled to a first bidirectional data bus and
a second bidirectional data bus. The memory module generates a second write clock signal in
response to the first write clock signal for transmitting data from the buffer in the first direction of
transmission if the write command indicates that data is to be written to another memory module
in the system, and further generates a memory write clock signal in response to the first write clock
signal for writing data from the buffer to the memory if the write command indicates that data is to
be written to the memory in the module. The memory module further generates a memory read
clock signal in response to the first write clock signal for reading data from the memory to the
buffer if the read command indicates that data is to be read from the memory in the module.

The memory module may further generate a second read clock signal in response to the 
first write clock signal for transmitting data from the buffer in the second direction of transmission 
if the read command indicates that data is to be read from another memory module in the system. 
The memory read clock signal preferably comprises a returned signal of the memory write
clock signal, in which case the memory read clock signal is generated on a transmission path that
is coupled to a transmission path of the memory write clock signal. A dummy load may be
coupled to the transmission path of the memory read clock signal and the memory write clock 
signal. The transmission path length of the memory read clock signal and the transmission path 
length of the memory write clock signal are preferably equal to the transmission path length of the 
data signals between the memory and the buffer. 

The second write clock signal, the second read clock signal, the memory write clock signal,
and the memory read clock signal are preferably generated in response to the first write clock
signal such that the generated signals are in phase with the first write clock signal, for example by
a phase locked loop or delay locked loop.

In another aspect, the present invention is directed to a memory module for use in a
memory system having a point-to-point bus configuration. The memory module includes a
memory device and a buffer, the buffer receiving a first write clock signal and a control signal that
includes a read or write command in a first direction of transmission, the buffer receiving a first
read clock signal in a second direction of transmission, the buffer being coupled to a first
bidirectional data bus and a second bidirectional data bus.

The memory module generates a second write clock signal in response to the first write 
clock signal for transmitting data from the buffer in the first direction of transmission if the write 
command indicates that data is to be written to another memory module in the system. 

The memory module generates a memory write clock signal in response to the first write 
clock signal for writing data from the buffer to the memory if the write command indicates that
data is to be written to the memory in the module. 

The memory module generates a memory read clock signal in response to the first write 
clock signal for reading data from the memory to the buffer if the read command indicates that data 
is to be read from the memory in the module. 

The memory module generates a second read clock signal in response to the first write 
clock signal for transmitting data from the buffer in the second direction of transmission if the read 
command indicates that data is to be read from another memory module in the system. 

In another aspect, the present invention is directed to a memory system having a 
point-to-point bus configuration. The system includes a memory controller for generating a first 
write clock signal and a control signal that includes a read or write command; and a memory 
module including a memory device and a buffer, the buffer receiving the first write clock signal 
and the control signal in a first direction of transmission, the buffer receiving a first read clock 
signal in a second direction of transmission, the buffer being coupled to a first bidirectional data 
bus and a second bidirectional data bus. The memory module generates a second write clock 
signal in response to the first write clock signal for transmitting data from the buffer in the first 
direction of transmission if the write command indicates that data is to be written to another 
memory module in the system, and generates a memory write clock signal in response to the first 
write clock signal for writing data from the buffer to the memory if the write command indicates 
that data is to be written to the memory in the module. The memory module further generates a 
memory read clock signal in response to the first write clock signal for reading data from the 
memory to the buffer if the read command indicates that data is to be read from the memory in the 
module. 

In another aspect, the present invention is directed to a memory system having a 
point-to-point bus configuration. The system comprises a memory controller for generating a first 
write clock signal and a control signal that includes a read or write command and a read clock 
generator for generating a first read clock signal. A memory module includes a memory device 
and a buffer, the buffer receiving the first write clock signal and the control signal in a first 
direction of transmission, the buffer receiving the first read clock signal in a second direction of 
transmission, the buffer being coupled to a first bidirectional data bus and a second bidirectional
data bus. The memory module generates a second write clock signal in response to the first write
clock signal for transmitting data from the buffer in the first direction of transmission if the write
command indicates that data is to be written to another memory module in the system, and
generates a memory write clock signal in response to the first write clock signal for writing data
from the buffer to the memory if the write command indicates that data is to be written to the
memory in the module. The memory module generates a memory read clock signal in response to
the first write clock signal for reading data from the memory to the buffer if the read command
indicates that data is to be read from the memory in the module; and generates a second read clock
signal in response to the first read clock signal for transmitting data from the buffer in the second
direction of transmission.

BRIEF DESCRIPTION OF THE DRAWINGS

The foregoing and other objects, features and advantages of the invention will be apparent
from the more particular description of preferred embodiments of the invention, as illustrated in
the accompanying drawings in which like reference characters refer to the same parts throughout
the different views. The drawings are not necessarily to scale, emphasis instead being placed upon
illustrating the principles of the invention.

FIG. 1 is a schematic block diagram of a point-to-point memory system in accordance with
the present invention.

FIG. 2 is a schematic block diagram illustrating clock signals that are passed in conjunction 
with the data between a data buffer and memory devices of a memory module for the clocking 
technique according to the present invention. 

FIG. 3 illustrates the generation of the module read clock RCLK_MDL signal at a memory
device, by returning the received module write clock WCLK_MDL signal, for the clocking of data
transferred between a data buffer and a memory device in accordance with the present invention.

FIG. 4 is a schematic block diagram of a read operation in which the output read clock 
RCLK_OUT is generated in response to the input write clock WCLK_IN in accordance with the 
present invention. 
FIG. 5 is a schematic block diagram of a write operation in which the output write clock
WCLK_OUT is generated in response to the input write clock WCLK_IN, in accordance with the 
present invention. 

FIG. 6 is a schematic block diagram of a second embodiment of the present invention in 
which the read clock RCLK is generated by an external read clock generator 50. 

FIG. 7 is a schematic block diagram illustrating generation of the output read clock
RCLK_OUT in response to the input read clock RCLK_IN, and generation of the output write
clock WCLK_OUT in response to the input write clock WCLK_IN, in accordance with the present
invention.

FIG. 8 is a schematic block diagram illustrating generation of the module read clock 
RCLK_MDL signal by coupling the module write clock WCLK_MDL to a dummy load in 
accordance with the present invention. 

FIG. 9 is a schematic block diagram illustrating generation of the module read clock 
RCLK_MDL by a phase locked loop or delay locked loop in response to the module write clock 
WCLK_MDL, in accordance with the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS 

FIG. 1 is a schematic block diagram of a memory system according to the present
invention. The memory system includes a memory controller 40 and a plurality of memory modules
42A, 42B. A number of signal lines 56, for example mounted on a motherboard, transfer signals
between the memory controller 40 and the various modules 42A, 42B.

Each memory module 42A, 42B includes a data buffer 48, a command/address signal 
buffer 46, and a plurality of memory devices 44. In one example, the memory devices 44 may 
comprise dynamic random access memory (DRAM) devices. The data buffer 48 manages the 
buffering of data signals on the data bus DQ, and transfers the data in response to a write clock 
signal WCLK and a read clock signal RCLK, among others. The command/address buffer 46 
manages the buffering of command signals, address signals, and flag signals, and controls the 
data buffer 48 and the memory devices 44 in accordance with the command, address, and flag 
signals. During a write operation, the data buffer 48 transfers buffered data to the memory devices
44, while during a read operation, the data buffer 48 receives data from the memory devices 44.
While only two memory modules, namely 42A, 42B, are shown in the exemplary illustration of
FIG. 1, it is understood that additional memory modules can be added to the system in like manner.

In the point-to-point system architecture of the present invention, the data bus DQ is 
transferred on an independent line from the memory controller 40 to the data buffer 48 of the first 
memory module 42A. Similarly, the write clock signal WCLK is passed from the memory 
controller 40 to the data buffer 48 as well as the command/address buffer 46 of the first memory 
module on an independent line. The read clock RCLK is received by the memory controller 40 
from the data buffer 48 of the first memory module 42A on an independent line. Also, the 
command/address C/A and DFLAG signals are transferred to the command/address buffer 46 of 
the first memory module from the memory controller 40 on an independent line, and the RFLAG 
signal is received by the memory controller from the command/address buffer 46 of the first 
memory module 42A on an independent line. 

Signals are similarly transferred between the first memory module 42A and the second 
memory module 42B on signal lines DQ1, WCLK1, RCLK1, C/A&DFLAG1, and RFLAG1 that
are independent of the signal lines for passing signals between the memory controller 40 and the
first memory module 42A. Another set of signal lines DQ2, WCLK2, RCLK2, C/A&DFLAG2,
and RFLAG2 transfer signals between the second memory module 42B and a third memory
module (not shown), and so on. As explained above, in the point-to-point bus structure, only a 
single load is driven by a single signal source, and therefore the addition of further memory 
modules does not impart an additional load on the signal lines. 
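
The segment naming and loading described above can be sketched as follows. This is an illustrative model only, not part of the disclosed circuitry: the function names are hypothetical, and the only facts taken from the description are the segment names (DQ, DQ1, DQ2, and so on) and the property that each point-to-point segment has a single driver and a single load, whereas a stub bus places every module on one shared line.

```python
def link_names(signal: str, num_modules: int) -> list[str]:
    """Name the independent segments of one signal along the module chain.

    Segment 0 runs from the controller to the first module (e.g. "DQ");
    segment k runs from module k to module k+1 (e.g. "DQ1", "DQ2").
    """
    return [signal if k == 0 else f"{signal}{k}" for k in range(num_modules)]


def receivers_per_segment(num_modules: int) -> list[int]:
    """Point-to-point bus: every segment is driven into exactly one load."""
    return [1] * num_modules


def receivers_on_stub_bus(num_modules: int) -> int:
    """Stub bus (for comparison): all modules load the same shared line."""
    return num_modules


if __name__ == "__main__":
    for n in (2, 3, 4):
        print(n, link_names("DQ", n), receivers_per_segment(n),
              receivers_on_stub_bus(n))
```

Adding a module in this model appends a new segment with one load; it does not change the load on any existing segment.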

As described above, data are exchanged between the memory controller 40 and the first 
and second memory modules 42A, 42B on a local, independent data bus DQ. A write clock 
WCLK is generated by the memory controller 40 and is transmitted to the data buffer 48 and 
command/address buffer 46 of the first memory module 42A, as a reference for the transfer of data
DQ from the memory controller 40 to the first memory module 42A in synchronization with
the rising and falling edges of the write clock WCLK. Similarly, the command/address signals
(C/A) are transferred to the first memory module 42A from the memory controller 40 in
synchronization with the write clock signal WCLK. In this manner, the write clock signal WCLK,
as received by the data buffer 48, is used to sample the data received on the data bus DQ by the data
buffer 48, while the same write clock signal WCLK, as received by the command/address buffer
46, is used to sample the command/address signals received on the command/address bus C/A by
the command/address buffer 46.

Upon receiving a command/address C/A signal, the command/address buffer 46 of the
first memory module buffers the received command/address C/A signal and then transmits the
buffered command/address C/A signal to the memory devices 44 of the first memory module 42A,
and simultaneously transmits via signal 45 the command/address C/A signal to the
command/address buffer 46 of the second memory module 42B. The command/address buffer 46
of each module 42A, 42B functions primarily to transmit the input command/address signal to
each memory device 44 hosted on the module 42A, 42B and to the command/address buffer on the
adjacent module, and also functions to perform a minimal level of command/address decoding for
transmitting a decoding signal 47 that notifies the corresponding data buffer 48 on that module of
the input/output direction of the data signals DQ. In other words, the command/address buffer
notifies the data buffer 48 as to whether the data signals DQ present in the data buffer 48 are to be
transmitted to the memory devices 44 in the local module, or to memory devices 44 in another
module in the system, or to the memory controller 40.
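
The three-way routing decision conveyed by the decoding signal 47 can be sketched as below. This is a minimal behavioral sketch under stated assumptions, not the decoding logic of the command/address buffer 46: the module numbering (1 is the module nearest the controller), the `Route` enumeration, and the function `decode_direction` are all hypothetical names introduced for illustration.

```python
from enum import Enum
from typing import Optional


class Route(Enum):
    TO_LOCAL_MEMORY = "memory devices on this module"
    TO_NEXT_MODULE = "data buffer of the next module (write direction)"
    TOWARD_CONTROLLER = "memory controller side (read direction)"


def decode_direction(command: str, target_module: int,
                     local_module: int) -> Optional[Route]:
    """Return where the local data buffer should send the data it holds.

    Assumes modules are numbered from 1 at the controller end of the chain.
    Returns None when this module does not lie on the data path.
    """
    if command == "write":
        if target_module == local_module:
            return Route.TO_LOCAL_MEMORY
        if target_module > local_module:
            return Route.TO_NEXT_MODULE
        return None  # the write was already consumed by a nearer module
    if command == "read":
        # Read data originates at the target and passes every nearer module.
        if target_module >= local_module:
            return Route.TOWARD_CONTROLLER
        return None
    raise ValueError(f"unknown command: {command!r}")


if __name__ == "__main__":
    print(decode_direction("write", target_module=2, local_module=1))
    print(decode_direction("write", target_module=1, local_module=1))
    print(decode_direction("read", target_module=2, local_module=1))
```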

In traditional memory systems, it is common for the data bus DQ to operate at twice the
rate of the command/address C/A bus. For this reason, control commands are
provided to the memory modules 42A, 42B in advance of the data so that the memory devices on
the module have sufficient time to prepare for the data read or data write operation. The latency
between the command and data signals is commonly referred to as column address strobe (CAS)
latency. With reference to FIG. 1, an optional data flag DFLAG signal, generated by the memory
controller 40, provides the CAS latency information for both read and write operations to the
memory modules 42A, 42B. The C/A buffer 46A, 46B receives the DFLAG signal from the
memory controller 40 and outputs a localized data flag signal to each memory device 44 on the
module 42A, 42B via buffered DFLAG_MDL signal 45. Upon sensing a transition in the DFLAG
signal, each memory device 44 on the module 42A, 42B outputs read or write data on the data bus
DQ following a predetermined time interval. The DFLAG signal is received by the
command/address buffer 46 in synchronization with the write clock WCLK signal. The DFLAG
signal will experience the same propagation delay as the WCLK in the direction of propagation
between the memory controller 40 and the memory modules 42A, 42B.
The C/A buffer 46 may optionally generate a return flag signal RFLAG for the return path
in response to the DFLAG signal. The optional RFLAG signal may be needed in cases where
there is a phase difference between the read clock RCLK, which is synchronized with the read data
DQ, and the DFLAG signal generated by the controller. If it is possible for the memory controller
to compensate for the phase difference, the RFLAG signal can be eliminated. The RFLAG signal
carries timing information related to when read data DQ output by the memory devices 44 will
arrive at the memory controller 40. While the memory controller 40 can receive valid data
transferred from the memory module 42A in synchronization with the read clock signal RCLK
transferred from the memory module 42A, it is possible for the memory controller to receive
invalid data from the memory module 42A, should the time difference between the WCLK and
RCLK signals at the controller be greater than one clock cycle. The RFLAG signal ensures that
valid data is received by the memory controller 40 at all times, and as such, the memory controller
40 receives the data in response to the read flag signal RFLAG and read clock signal RCLK
transferred from the first module 42A.

Accordingly, the memory controller 40 recognizes the read data DQ arrival time via the
RFLAG signal output by the C/A buffer 46A. The RFLAG signal preferably has the same
propagation delay time as the read data DQ signals, as the line on which the RFLAG signal is
transported is preferably configured to be routed with, and therefore have the same propagation
delay as, the return clock RCLK and data bus DQ signals.

The data buffer 48 receives or transmits data according to whether a write operation or a
read operation is to be performed. In the case of a write operation, the data buffer 48 receives data
signals DQ transmitted from the memory controller 40 in synchronization with the write clock
signal WCLK output by the controller 40. The data buffer 48 then determines whether to transmit
the data signals DQ to the memory devices 44 mounted on the local module on the basis of the
command/address decoding signal 47 generated by the command/address buffer 46. With reference to
FIG. 2, assuming that data is to be written to a memory device 44 local to the module 42, the data
buffer 48 generates a module write clock WCLK_MDL based on the input write clock signal
WCLK_IN and transmits the data signals DQ to the memory devices 44 in synchronization with
the module write clock signal WCLK_MDL. In a preferred embodiment, the module write clock
signal is generated based on the input write clock signal WCLK_IN, such that the two signals are
in phase with each other.

In the case of a data read operation, the data buffer 48 receives read data DQ in
synchronization with a module read clock signal RCLK_MDL that is generated based on the
module write clock signal WCLK_MDL received by the memory devices 44. Next, with
reference to FIG. 1 and FIG. 2, the data buffer 48 outputs the buffered read data DQ to the memory
controller 40 in synchronization with the read clock signal RCLK_OUT generated by the first
module 42A based on the input write clock signal WCLK_IN. Alternatively, in the case of a
second module 42B, the data buffer 48 outputs the read data DQ to the data buffer 48 of the
adjacent module 42A in synchronization with an output read clock RCLK_OUT signal generated
based on the received write clock WCLK_IN signal.

FIG. 3 is a schematic block diagram illustrating the interaction of the module read clock
RCLK_MDL and module write clock WCLK_MDL signals used for transferring data DQ
between the data buffer 48 and memory devices 44 of a given memory module 42A, 42B. As
explained above, data is written from the data buffer 48 to the memory device 44 in
synchronization with the module write clock WCLK_MDL. Similarly, data is read from the
memory device 44 to the data buffer 48 in synchronization with the module read clock
RCLK_MDL. The module write clock signal WCLK_MDL line and the module read clock signal
line RCLK_MDL are preferably routed with the data bus lines DQ on the memory module
between the data buffer 48 and the memory device 44 such that the clock signals WCLK_MDL,
RCLK_MDL and the data signals DQ experience the same propagation delay. In this manner, the
transmitted data and clock signals will arrive simultaneously at the receiving unit, and therefore
the received clock signal can be used to clock the data signals with precision.
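
The effect of routing the clock with the data can be illustrated numerically. The sketch below is an assumption-laden example rather than a design rule from this description: the per-millimetre delay of 6.7 ps is only a representative figure for a printed circuit board trace, and the function names and trace lengths are hypothetical.

```python
# Assumed, illustrative trace delay; real values depend on board stack-up.
DELAY_PS_PER_MM = 6.7


def propagation_delay_ps(length_mm: float,
                         delay_ps_per_mm: float = DELAY_PS_PER_MM) -> float:
    """Flight time of a signal along a trace of the given length."""
    return length_mm * delay_ps_per_mm


def clock_to_data_skew_ps(clock_len_mm: float, data_len_mm: float) -> float:
    """Arrival-time difference (clock minus data) at the receiving unit."""
    return propagation_delay_ps(clock_len_mm) - propagation_delay_ps(data_len_mm)


if __name__ == "__main__":
    # Matched routing: clock and data arrive together, so the skew is zero.
    print(clock_to_data_skew_ps(30.0, 30.0))
    # A clock trace 5 mm longer than the data trace arrives later.
    print(clock_to_data_skew_ps(35.0, 30.0))
```

When the two lengths are equal the skew is zero regardless of the assumed delay constant, which is the property the matched routing relies on.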

In a preferred embodiment of the present invention, as shown in FIG. 3, the line on which
the module read clock RCLK_MDL is transferred may be coupled at the memory device 44 to the
line on which the module write clock WCLK_MDL is transferred. In this manner, the module read
clock signal RCLK_MDL is returned to the data buffer 48 in order to sample read data DQ output
from each memory device 44. As shown in FIG. 2, in this embodiment, a number of module read
clock RCLK_MDL signals are generated by each memory device 44 in response to each module
write clock signal WCLK_MDL.

In an alternative embodiment illustrated in FIG. 8, a single module read clock signal
RCLK_MDL is returned to the data buffer 48 in response to multiple module write clock signals
WCLK_MDL. As shown in FIG. 8, each of the four memory devices 44 receives a corresponding
module write clock signal WCLK_MDL. However, a fifth module write clock signal
WCLK_MDL is also generated, and tied to a dummy load 52. The length of the line of the module
write clock signal WCLK_MDL tied to the dummy load 52 is configured to match that of the
module write clock signals WCLK_MDL tied to actual memory devices 44. A module read clock
RCLK_MDL line is also tied to the dummy load 52 and returns to the data buffer 48. The length
of the line of the module read clock signal RCLK_MDL is configured to match the path length of
the data bus DQ between the memory devices 44 and the data buffer 48. The dummy load 52 is
preferably configured to have a capacitance that matches that of the clock pin of a memory device
44 receiving the module write clock signal WCLK_MDL. In this manner, the dummy load 52
loads the WCLK_MDL signal as though it were a memory device, while reducing the number of
clock pins required by the data buffer 48.
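
The pin saving can be made concrete with a simple count. The tally below is an inference from the description rather than a figure stated in it: it assumes one WCLK_MDL pin and one returned RCLK_MDL pin per memory device in the FIG. 3 arrangement, and one WCLK_MDL pin per device plus one WCLK_MDL pin and one RCLK_MDL pin for the dummy load in the FIG. 8 arrangement. The function names are hypothetical.

```python
def clock_pins_returned_per_device(num_devices: int) -> int:
    """FIG. 3 style (assumed): a WCLK_MDL output and a returned RCLK_MDL
    input on the data buffer for every memory device."""
    return 2 * num_devices


def clock_pins_with_dummy_load(num_devices: int) -> int:
    """FIG. 8 style (assumed): one WCLK_MDL per device, one extra WCLK_MDL
    to the dummy load, and a single RCLK_MDL returned from the dummy load."""
    return num_devices + 2


if __name__ == "__main__":
    n = 4  # four memory devices, as in the FIG. 8 example
    print(clock_pins_returned_per_device(n), clock_pins_with_dummy_load(n))
```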
In a second alternative embodiment illustrated in FIG. 9, a single module read clock signal
RCLK_MDL may be generated by a phase locked loop PLL (or delay locked loop DLL) in
response to the module write clock signal WCLK_MDL. As shown in FIG. 9, each of the four
memory devices 44 receives a corresponding module write clock signal WCLK_MDL. A fifth
module write clock signal WCLK_MDL is also generated, and is, in this case, received by a
phase locked loop PLL (or delay locked loop DLL) 54, that returns a module read clock
RCLK_MDL signal in response to the received module write clock WCLK_MDL signal. Phase
locked loops and delay locked loops are well-known mechanisms for ensuring that an output
signal is generated so that the transition edges of the output signal are aligned to those of an input
signal; namely, the transition edges of the RCLK_MDL signal are aligned with those of the
WCLK_MDL signal. In the case of a phase locked loop (PLL), the phase of a voltage controlled
oscillator is controlled until the clock edge of the output RCLK_MDL signal is aligned to that of
the input WCLK_MDL signal. In the case of a delay locked loop (DLL), the input signal
WCLK_MDL is applied to a variable delay line, the delay of which is controlled until the clock
edge of the output signal RCLK_MDL is aligned with that of the input signal WCLK_MDL.
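
The delay locked loop behavior described above, in which a variable delay is adjusted until the output edge lines up with an input edge, can be sketched as a simple step-and-compare loop. This is a toy behavioral model, not the circuit of element 54: the function name, the integer picosecond units, the fixed path delay, and the one-step-at-a-time search are all assumptions made for illustration.

```python
def lock_delay_line(period_ps: int, fixed_delay_ps: int, step_ps: int,
                    max_steps: int = 100_000) -> int:
    """Return the variable delay at which the output edge aligns with an
    input edge, to within one delay step.

    The output clock lags the input by fixed_delay_ps plus the variable
    delay. Alignment means that total lag is a whole number of periods.
    """
    delay_ps = 0
    for _ in range(max_steps):
        lag_ps = (fixed_delay_ps + delay_ps) % period_ps
        # Remaining delay needed to reach the next input edge.
        residual_ps = (period_ps - lag_ps) % period_ps
        if residual_ps < step_ps:
            return delay_ps  # locked: edges coincide within one step
        delay_ps += step_ps  # phase detector says "late edge not reached"
    raise RuntimeError("delay line did not lock")


if __name__ == "__main__":
    locked = lock_delay_line(period_ps=1000, fixed_delay_ps=300, step_ps=10)
    print(locked, (300 + locked) % 1000)
```

A phase locked loop reaches the same aligned condition by steering an oscillator instead of a delay line; that case is not modeled here.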

FIG. 4 is a schematic block diagram of a read operation in which the output read clock
RCLK_OUT is generated in response to, or based on, the input write clock WCLK_IN. In this
example, the first module 42A receives a write clock WCLK, referred to herein as an input write
clock WCLK_IN, for example from a memory controller 40 or adjacent memory module. The
memory module 42A in turn generates an output write clock WCLK_OUT that is transferred to the
second memory module 42B. The output write clock WCLK_OUT is generated based on the input
write clock WCLK_IN and is in phase therewith. As an example embodiment of generating an
in-phase output write clock signal WCLK_OUT based on the input write clock signal WCLK_IN,
the output write clock WCLK_OUT signal can be generated as the output of a PLL or DLL that
receives, as an input, the input write clock signal WCLK_IN.

Similarly, an output read clock RCLK_OUT is generated by the first memory module 42A,
in response to the input write clock WCLK_IN signal. The output read clock is transferred to the
memory controller 40, or an adjacent memory module, for the transfer of data DQ in the read
direction. A module write clock signal WCLK_MDL is also generated in response to the received
input write clock signal WCLK_IN, as described above, for clocking the internal transfer of data
between the data buffer 48 and the memory devices 44. The data buffer 48 of the first memory
module 42A further receives an input read clock RCLK_IN that is generated by the second
memory module 42B to sample the read data DQ transferred from the second memory module
42B. That is, the data buffer 48 of the first memory module 42A receives the read data DQ
transferred from the second memory module 42B in synchronization with the input read clock
RCLK_IN generated and output as signal RCLK_OUT by the second memory module 42B.

The output write clock WCLK_OUT of the first memory module 42A is transferred to a
second memory module 42B and received as an input write clock WCLK_IN at the second
memory module 42B. The second memory module 42B generates an output write clock
WCLK_OUT and an output read clock RCLK_OUT in response to the received input write clock
WCLK_IN signal, in a manner similar to the first memory module 42A. Similarly, an internal
module write clock WCLK_MDL is generated based on the input write clock WCLK_IN signal.
Assuming a read operation as shown in FIG. 4, data is transferred in this example from the
second memory module 42B to the first memory module 42A in a right-to-left direction using the
input read clock RCLK_IN and output read clock RCLK_OUT for synchronized transfer of the
read data DQ. Assuming data is to be read from the second memory module 42B to the first
memory module 42A, the data buffer 48 of the second memory module 42B outputs the read data
DQ to the data buffer 48 of the first memory module 42A in synchronization with the output read
clock RCLK_OUT signal. As described above, in this example, the output read clock
RCLK_OUT is generated based on the input write clock WCLK_IN received by the second
memory module 42B. A read operation for transferring data in the read direction from the first
memory module 42A to the memory controller 40 operates in similar fashion.
Since, in this example, the output read clock RCLK_OUT signal is generated in response
to the input write clock WCLK_IN, the highest order memory module (in this case, the second
memory module 42B) does not require an input read clock RCLK_IN signal. Therefore, there is
no need for a separate source for the read clock signals RCLK in this embodiment. All write clock
WCLK and read clock RCLK signals are generated based on the write clock signal WCLK
generated at the memory controller 40.
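
By way of illustration only, the clock derivation of this embodiment may be sketched in Python as follows. The names `build_clock_tree`, `root_source` and `controller.WCLK` are hypothetical labels introduced for this sketch and do not appear in the figures; the sketch records only which signal each clock is derived from, not its timing.

```python
def build_clock_tree(module_names):
    """Map each clock signal to the signal it is derived from.

    module_names is ordered from the lowest-order module (nearest the
    memory controller) to the highest-order module.
    """
    derived_from = {}
    upstream = "controller.WCLK"  # hypothetical label for the controller's write clock
    for name in module_names:
        wclk_in = f"{name}.WCLK_IN"
        derived_from[wclk_in] = upstream  # received from the lower-order neighbor
        # In this embodiment all three generated clocks are based on WCLK_IN.
        for generated in ("WCLK_OUT", "WCLK_MDL", "RCLK_OUT"):
            derived_from[f"{name}.{generated}"] = wclk_in
        upstream = f"{name}.WCLK_OUT"
    # RCLK_IN of a module is the RCLK_OUT of its higher-order neighbor;
    # the highest-order module therefore has no RCLK_IN entry.
    for lower, higher in zip(module_names, module_names[1:]):
        derived_from[f"{lower}.RCLK_IN"] = f"{higher}.RCLK_OUT"
    return derived_from


def root_source(signal, derived_from):
    """Follow the derivation chain back to the signal that has no source."""
    while signal in derived_from:
        signal = derived_from[signal]
    return signal


tree = build_clock_tree(["42A", "42B"])
print(root_source("42A.RCLK_IN", tree))  # controller.WCLK
print("42B.RCLK_IN" in tree)             # False
```

Under these assumptions every write clock and read clock in the chain traces back to the single write clock of the memory controller 40, and the highest order module has no input read clock.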

With reference to FIG. 5, during a write operation, data is transferred from the first
memory module 42A to the second memory module 42B (and/or from the memory controller 40
to the first memory module 42A) in a left-to-right direction. The data buffer 48 of the first memory
module 42A receives write data DQ from the controller 40, in synchronization with the input write
clock WCLK_IN signal. The data buffer 48 next determines whether to transmit the write data DQ
to the memory devices DRAM 44 on the first memory module 42A, on the basis of the C/A
decoding signal generated by the C/A buffer of the first memory module 42A. If the data DQ is
to be transferred to the second memory module 42B according to the C/A decoding signal, the data
buffer 48 of the first memory module 42A transfers the received data DQ to the data buffer 48 of
the second memory module 42B. The first memory module 42A generates an output write clock
WCLK_OUT signal based on the input write clock WCLK_IN signal, and the data DQ from the
data buffer 48 is transferred from the first memory module 42A to the second memory module 42B
in synchronization with the output write clock WCLK_OUT signal generated by the first memory
module. The WCLK_OUT signal generated by the first memory module is received as the input
write clock WCLK_IN signal at the second memory module 42B for clocking the data
transferred from the first memory module 42A to the second memory module 42B.

In this manner, a data buffer 48 of a given memory module 42A, 42B generates at least
three clock signals, namely an output write clock WCLK_OUT, an output read clock RCLK_OUT
and a module write clock WCLK_MDL based on the input write clock signal WCLK_IN. A PLL
or DLL may be employed, for example, to generate the three clock signals in response to the input
write clock WCLK_IN signal. Furthermore, the data buffer 48 receives a module read clock signal
RCLK_MDL from the memory device 44 in response to the module write clock signal
WCLK_MDL, and receives an input read clock RCLK_IN from an adjacent module 42B.
Accordingly, the data buffer 48 in this example includes three clock domains. The first
clock domain is determined by the input write clock signal WCLK_IN received from an adjacent
lower-order memory module, or memory controller. The second clock domain is determined by
the module read clock signal RCLK_MDL received from the local memory devices 44. The third
clock domain is determined by the input read clock signal RCLK_IN received from an adjacent
higher-order memory module.

By establishing that the data lines for data transfer in synchronization with a given clock
are routed with the line for that clock, both on the motherboard connecting the memory modules
and the memory controller, and also for the data lines routed on a given module, the present
invention provides a suitable clock that is in phase with data for all data being transferred in the
system. In other words, the data, and the associated clock, experience the same propagation path,
and therefore have the same propagation delay. In view of this, the data and clock are received by
the receiving unit in-phase and therefore the received clock can be used to sample the received data
with high precision. This feature enhances overall system efficiency and reliability.
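
The effect of routing a clock with its data may be illustrated by the following Python sketch. The segment delays are hypothetical values chosen only for the arithmetic, and the function names are introduced here for illustration; delays are in picoseconds.

```python
def path_delay_ps(segments_ps):
    """Total propagation delay of a route given its per-segment delays."""
    return sum(segments_ps)


def sampling_skew_ps(data_segments_ps, clock_segments_ps):
    """Arrival time of the data minus arrival time of the clock at the receiver."""
    return path_delay_ps(data_segments_ps) - path_delay_ps(clock_segments_ps)


# Hypothetical delays for connector, via and trace segments.
shared_route = [120, 35, 260]
separately_routed_clock = [120, 35, 410]

print(sampling_skew_ps(shared_route, shared_route))             # 0
print(sampling_skew_ps(shared_route, separately_routed_clock))  # -150
```

When the data and clock share the same route the skew is zero whatever the individual segment delays are, which is why the received clock can sample the received data directly.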

In the example provided above, the lines carrying the data DQ signals between the memory
controller 40 and the first module 42A, and the data DQ signals between the first module 42A and
the second module 42B are preferably routed with the lines of the corresponding WCLK and
RCLK signals, as well as the lines of the corresponding control/address and DFLAG and RFLAG
signals. Likewise, the lines carrying the data DQ signals between the data buffer 48 and a given
memory device 44 are preferably routed with the lines of the corresponding module write clock
WCLK_MDL and corresponding module read clock RCLK_MDL signals, to ensure that the data
and clock are received by the receiving unit in synchronization with each other.

The difference in phase between the first clock domain that is based on the input write
clock signal and the second clock domain that is based on the received module read clock
RCLK_MDL signal is the round-trip propagation delay for the module write clock WCLK_MDL
and module read clock RCLK_MDL signals from the data buffer 48 to the memory device 44.
However, this round-trip delay is fixed by the physical design of the module, that is, by the
routing of the WCLK_MDL and RCLK_MDL signals. Therefore the data buffer can readily
transfer the data to and from each clock domain through simple clock domain crossing circuitry.
Clock domain crossing is used to transfer data received from the memory device 44 in
synchronization with the module read clock signal RCLK_MDL at the data buffer 48 for providing
read data to be transferred from the module in synchronization with the output read clock
RCLK_OUT signal. However, since the delay between the second and first clock domains is fixed,
domain crossing is relatively easy, and data can therefore be transferred from the RCLK_MDL
clock domain to the RCLK_OUT clock domain. Another need for clock domain crossing in the
data buffer 48 arises between the third clock domain based on the input read clock RCLK_IN and
the output read clock RCLK_OUT signal of the first clock domain (generated based on the input
write clock WCLK_IN signal) for transferring data during a read operation. The phase difference
between the input read clock RCLK_IN and the output read clock RCLK_OUT in a given data
buffer 48 is the round trip delay from one module to a neighboring module. Since this phase
difference is constant or fixed, assuming the respective placements of the modules are such that the
modules are at a fixed distance, such compensation is easy to handle. If the phase difference
between the two clocks were to differ between the first module and the second module, the buffer
would need to handle this variable phase difference in order to transfer the data between the two
clock domains. However, in the present invention, the phase difference between the input read clock
RCLK_IN and the output read clock RCLK_OUT is the same at all memory modules. Therefore,
the buffer can easily handle the domain crossing. This is in contrast with conventional RAMBUS
systems, wherein the phase difference between the forward clock and reverse clock (CTM, CFM)
varies according to the location of the memory device, such that memory devices in these
systems require complex domain crossing circuitry.
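
The constant phase difference in this embodiment may be checked with a simple timing model, sketched below in Python. The model rests on assumptions that are not part of the description: every module-to-module hop has the same delay, the buffers add no latency of their own, and the controller launches a write clock edge at time zero. The function name and the 500 ps hop delay are hypothetical.

```python
def first_embodiment_read_gap_ps(num_modules, hop_delay_ps):
    """RCLK_IN minus RCLK_OUT (in ps) at each module that has a higher-order neighbor.

    Module 1 is nearest the controller; module num_modules is the highest order.
    """
    gaps = {}
    for k in range(1, num_modules):  # the highest-order module has no RCLK_IN
        wclk_in_k = k * hop_delay_ps            # write clock edge arrives after k hops
        rclk_out_k = wclk_in_k                  # RCLK_OUT is generated from WCLK_IN
        rclk_out_next = (k + 1) * hop_delay_ps  # neighbor's RCLK_OUT, one hop later
        rclk_in_k = rclk_out_next + hop_delay_ps  # returns over the same hop
        gaps[k] = rclk_in_k - rclk_out_k
    return gaps


gaps = first_embodiment_read_gap_ps(num_modules=10, hop_delay_ps=500)
print(set(gaps.values()))  # {1000}
```

Under these assumptions the gap at every module equals two hop delays, that is, the round trip to the neighboring module, regardless of the position of the module in the chain.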

In the write direction, no domain crossing is needed, since the output write clock
WCLK_OUT is generated based on the input write clock WCLK_IN signal, and the two therefore
share the same clock domain, namely, the first clock domain identified above.

FIG. 6 is a schematic block diagram of a second embodiment of the present invention. In
this embodiment, the output read clock signal RCLK_OUT is not generated by a given module
based on the input write clock WCLK_IN, as described above. Instead, the output read clock
signal RCLK_OUT is generated based on the received input read clock signal RCLK_IN. The
input read clock signal RCLK_IN is first received by the highest order memory module (in this
example, the second memory module 42B), as generated by a master read clock generator 50. The
second memory module 42B (as well as the first memory module 42A) generates an output read
clock signal RCLK_OUT that is based on the input read clock RCLK_IN signal, as described
above.

As shown in FIG. 7, during a read operation, data DQ is transferred from the second
memory module 42B to the first memory module 42A, and from the first memory module 42A to
the memory controller 40, in synchronization with the output read clock signal RCLK_OUT that
is generated in response to the corresponding input read clock signal RCLK_IN. The write
operation for this embodiment is similar to that of the embodiment described above. Since the
input read clock RCLK_IN and output read clock RCLK_OUT share the same phase relationship,
no clock domain crossing is required for these two signals. However, the phase relationship
between the input write clock WCLK_IN and input read clock RCLK_IN signals varies depending
on the position of a given module, since the write clock WCLK and read clock RCLK signals are
generated at different sources, and propagate in opposite directions. Therefore, resolution of
domain crossing in this configuration is very complicated. This configuration is conceptually
similar to that of a RAMBUS system. Assume there are 10 memory modules in the system. In this
case, the phase difference between the input write clock WCLK_IN signal and the input read clock
RCLK_IN signal is different at each memory module. The phase difference at the last module in
the chain could be, for example, ten times that of the first module. The resulting phase difference
at the last module can be greater than the clock cycle time, or even multiples of the clock cycle
time. In this case, the buffer should include phase difference detection circuitry to avoid data
transfer failures. In the RAMBUS case, a training sequence is employed at the power-up stage to
detect the phase difference between the CTM and CFM clocks.
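
The position dependence described for this second embodiment may be illustrated with the following Python sketch. It is a simplified model under stated assumptions, none of which is taken from the figures: all hops have equal delay, the buffers add no latency, the read clock generator sits one hop beyond the highest order module, and both clock sources launch an edge at time zero. The function name, the 500 ps hop delay and the 2500 ps clock period are hypothetical.

```python
def second_embodiment_phase_gap_ps(num_modules, hop_delay_ps):
    """WCLK_IN arrival minus RCLK_IN arrival (in ps) at each module.

    Module 1 is nearest the controller; the read clock generator is assumed
    to sit one hop beyond module num_modules.
    """
    gaps = {}
    for k in range(1, num_modules + 1):
        wclk_in = k * hop_delay_ps                      # travels left to right
        rclk_in = (num_modules - k + 1) * hop_delay_ps  # travels right to left
        gaps[k] = wclk_in - rclk_in
    return gaps


gaps = second_embodiment_phase_gap_ps(num_modules=10, hop_delay_ps=500)
spread_ps = max(gaps.values()) - min(gaps.values())
clock_period_ps = 2500  # hypothetical clock cycle time

print(gaps[1], gaps[10])            # -4500 4500
print(spread_ps > clock_period_ps)  # True
```

In this model the gap changes by two hop delays from one module to the next, so with ten modules the spread across the chain exceeds the assumed clock cycle time several times over, consistent with the need for phase difference detection noted above.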

In this manner, the present invention provides a clocking technique in a point-to-point
memory system by which data, command and address signals are transferred between modules and
between a module and memory controller in synchronization with suitable clock signals that
experience the same propagation delay as the data signals. In addition, the clocking technique is
simplified at each module by generating the output write clock WCLK_OUT and the module write
clock WCLK_MDL in response to the input write clock WCLK_IN, the module read clock
RCLK_MDL in response to the module write clock WCLK_MDL, and, in a preferred
embodiment, the output read clock RCLK_OUT in response to the input write clock WCLK_IN.

While this invention has been particularly shown and described with references to 
preferred embodiments thereof, it will be understood by those skilled in the art that various 
changes in form and details may be made herein without departing from the spirit and scope of the 
invention as defined by the appended claims. 



